Abstract. The paper introduces a new technique for compressing
Binary Decision Diagrams in those cases where random access is not 
required. Using this technique, compression and decompression can 
be done in linear time in the size of the BDD and compression will 
in many cases reduce the size of the BDD to 1-2 bits per node. 

Empirical results for our compression technique are presented,
including comparisons with previously introduced techniques, showing
that the new technique dominates on all tested instances.



1 Introduction 



In this paper we introduce a technique for compressing binary de- 
cision diagrams for those cases where random access to the com- 
pressed representation is not needed. We will use the term offline 
to describe a BDD stored in such a manner that it no longer allows 
random access to its structure without decompressing the BDD first. 
The two primary areas in which decision diagrams are used in practice
are verification and configuration. In both of these areas it is
sometimes important to store binary decision diagrams offline using
as little space as possible. Primarily the need for such compression
arises when it is necessary to transmit binary decision diagrams
across communication channels with limited bandwidth. In the area
of verification this need arises for example when using a networked
cluster of computers to perform a distributed compilation of a binary
decision diagram [1]. A similar exchange of BDD data takes place
in distributed configuration as described in [13]. In such approaches
the fact that the network bandwidth is much lower than the memory 
bandwidth can become a major bottleneck as computers stall waiting 
to receive data to process. Transmitting the binary decision diagrams 
in a compressed representation can help alleviate this problem. The 
same problem arises in standard interactive configuration. Consider 
for example a web-based interactive configurator that uses BDDs to 
store the set of valid configurations. It must either transmit the BDD 
storing the configuration data to the customer's computer or perform
all computations on the server, leading to a network delay each time
the user makes a choice during the configuration. In order to reduce
the required bandwidth as well as lower the load time in the first case,
it is beneficial to transmit the BDD in a compressed format.

The paper is organized as follows: In Section 2 we present the
necessary notation and definitions. In Section 3 we present our
compression scheme. Finally, in Section 4 we show the compression
attained by applying our compression scheme on different instances.

1.1 Related work

Most of the work done on compressing decision diagrams aims to
achieve large reductions in size, while maintaining random access,
by means of better variable orderings [7, 3] or modifications to the
decision diagram data structure [4]. Such work is mostly orthogonal
to the aim of this paper, as the compression strategy we present can
easily be adapted for use with variations of basic BDDs.

The only previous work we are aware of for compressing BDDs
for offline storage is the work by Starkey and Bryant [11] and the
follow-up paper by Mateu and Prades-Nebot [8], which both describe
techniques for image compression using BDDs. The latter
of these includes a non-trivial encoding algorithm for storing the
BDD offline. Finally, Kieffer et al. [5] give theoretical results for
using BDDs for general data compression, including a technique for
storing BDDs. After presenting our own technique, we present
empirical results comparing the new encoder with the encoders from [8]
and [5].

2 Preliminaries 

Definition 1 (Ordered Binary Decision Diagram). An ordered
binary decision diagram on $n$ binary variables $X = \{x_1, \ldots, x_n\}$
is a layered directed acyclic graph $G(V, E)$ with $n + 1$ layers (some
of which may be empty) and exactly one root. We use $l(u)$ to denote
the layer in which the node $u$ resides. In addition the following
properties must be satisfied:

• If $|V| \neq 1$, there are exactly two nodes in layer $n + 1$. These nodes
have no outgoing edges and are denoted the 1-terminal and the
0-terminal. If $|V| = 1$ the layer will contain either the 1-terminal or
the 0-terminal.

• All nodes in layers 1 to $n$ have exactly two outgoing edges, denoted
the low and high edge respectively. We use $low(u)$ and $high(u)$
to denote the end-point of the low and high edge of $u$ respectively.

• For any edge $(u, v) \in E$ it is the case that $l(u) < l(v)$.

We use $E_{low}$ and $E_{high}$ to denote the set of low and high edges
respectively. An edge $(u, v)$ such that $l(u) + 1 < l(v)$ is called a
long edge and is said to skip layers $l(u) + 1$ to $l(v) - 1$. The length
of an edge $(u, v)$ is defined as $l(v) - l(u)$.

Definition 2 (Reduced ordered Binary Decision Diagram). An ordered
Binary Decision Diagram is called reduced iff for any two distinct
nodes $u, v$ it holds that $low(u) \neq low(v) \lor high(u) \neq high(v)$
and further that $high(u) \neq low(u)$ for all nodes $u$.

In the rest of this paper we will assume for all BDDs we are con- 
sidering that they are ordered and reduced. 

Definition 3 (Solution to a BDD). A complete assignment $\rho$ to $X$ is
a solution to a BDD $G(V, E)$ iff there exists a path $P$ from the root
in $G$ to the 1-terminal such that for every assignment $(x_i, b) \in \rho$,
where $b \in \{low, high\}$, there exists an edge $(u, v)$ in $P$ such that
one of the following holds:

- $l(u) < i < l(v)$

- $l(u) = i$ and $(u, v) \in E_b$

Figure 1. From left to right: A BDD and two different spanning trees on the
BDD. Solid and dashed edges correspond to high and low edges respectively.

Example 4. Figure 1(a) contains a BDD on the binary variables
$X = \{x_1, x_2, x_3, x_4\}$ with the solutions {(1,1,1,1),
(1,1,0,1), (1,0,1,0), (1,0,0,0), (0,1,1,0), (0,1,0,1),
(0,0,1,0), (0,0,0,1)}.

Definition 5 (BFS order). A BFS ordering $id_b : V \to \{1, \ldots, |V|\}$
of the nodes in a layered DAG $G(V, E)$ rooted in $r$ is the ordering of
$V$ in the order they are visited by a BFS in the DAG starting at $r$ and
traversing left edges prior to right edges.

Definition 6 (Layer order). A layer ordering
$id_l : V \to \{1, \ldots, |V|\}$ of the nodes in a layered DAG $G(V, E)$
rooted in $r$ is the ordering of $V$ layer by layer in increasing order of
the layer. Nodes at the same layer are ordered in the order that they are
visited by a DFS in the DAG starting at $r$ and traversing left edges prior
to right edges.

We refer to $id_b(v)$ and $id_l(v)$ as "the BFS id of v" and "the layer
id of v" respectively. Note that if all edges in a layered DAG have the
same length then the orderings $id_l$ and $id_b$ will be the same.
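
The two orderings of Definitions 5 and 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the DAG is given as a hypothetical `children` dictionary mapping each inner node to its (left, right) pair, `layer` maps nodes to layer numbers, and the DFS visit order is taken to be preorder. None of these names come from the paper.

```python
from collections import deque

def bfs_ids(root, children):
    """BFS ids (Definition 5): visit order of a left-first BFS from root."""
    ids = {root: 1}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in children.get(u, ()):  # (left, right); terminals have no entry
            if v not in ids:
                ids[v] = len(ids) + 1
                queue.append(v)
    return ids

def layer_ids(root, children, layer):
    """Layer ids (Definition 6): by layer, then by left-first DFS visit order."""
    visited, seen, stack = [], set(), [root]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        visited.append(u)
        # Push right before left so that the left child is visited first.
        stack.extend(reversed(children.get(u, ())))
    # sorted() is stable, so the DFS order is kept inside each layer.
    ordered = sorted(visited, key=lambda u: layer[u])
    return {u: i + 1 for i, u in enumerate(ordered)}
```

On a DAG with a long edge the two orderings can differ: with `children = {"r": ("a", "c"), "a": ("b", "c")}` and layers 1, 2, 3, 3 for r, a, b, c, the BFS reaches c before b, while the layer order places b before c.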

In our compression scheme we will make use of the following 
well-known fact: 

Lemma 7. Every binary tree can be unambiguously encoded using
2 bits per node.

To achieve such an encoding each node $v$ is encoded using two
bits. The first bit is true iff $v$ has a left child, and the second bit
is true iff $v$ has a right child. In order to make decoding possible the
order in which the children of already decoded nodes appear in the encoded
data must be known. This can for example be ensured by encoding
the nodes in a DFS or BFS order with either left-first or right-first
traversal. As an example, the encoding of the nodes of the binary
tree in Figure 1(c) in BFS order is (11, 11, 00, 11, 00, 00, 00).
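
A minimal Python sketch of the encoding behind Lemma 7, using left-first BFS order. The `Node` class and function names are illustrative assumptions, not taken from the paper's implementation.

```python
from collections import deque

class Node:
    """Plain binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def encode_tree(root):
    """Return one 2-bit string per node, in left-first BFS order."""
    codes = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        has_left = node.left is not None
        has_right = node.right is not None
        codes.append(("1" if has_left else "0") + ("1" if has_right else "0"))
        if has_left:
            queue.append(node.left)
        if has_right:
            queue.append(node.right)
    return codes

def decode_tree(codes):
    """Rebuild the tree shape from the output of encode_tree."""
    remaining = iter(codes)
    root = Node()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        code = next(remaining)
        if code[0] == "1":
            node.left = Node()
            queue.append(node.left)
        if code[1] == "1":
            node.right = Node()
            queue.append(node.right)
    return root
```

Decoding works because the decoder creates children in the same order in which the encoder visited them, so the i-th code always belongs to the i-th node taken from the queue. Decoding the sequence (11, 11, 00, 11, 00, 00, 00) from the example and re-encoding it gives back the same sequence.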



3 The Compression technique 

Our compression technique can be summarized by the following 
steps: 

1. Build a spanning tree on the BDD (Section 3.1).

2. Encode the edges in the spanning tree, using Lemma 7.

3. Encode by one bit the order in which the two terminals appear in
the spanning tree.

4. Encode the length of the edges in the spanning tree where
necessary (Section 3.2).

5. Encode the edges that are not in the spanning tree (Section 3.3).

6. Compress the resulting data using standard compression
techniques.

The decoder starts by reverting step (6) by decompressing the data. 
It then recreates the binary tree (1-2), restores the correct layer of 
each node (4), and restores the remaining missing edges (5). Below 
we give the details of each step. 

3.1 Building the spanning tree on the BDD 

Definition 8 (Spanning Tree). A spanning tree $G_T(V_T, E_T)$ on a
BDD $G(V, E)$ is a subgraph of $G$ for which $V_T = V$ and any two
vertices are connected by exactly one path of edges in $E_T$. An edge is
called a tree edge if it is contained in the spanning tree and a nontree
edge otherwise.

The most obvious way to construct a spanning tree on a graph is
to use DFS or BFS. In the case of a rooted DAG one can obtain a
spanning tree by, for each node $v$ except the root, adding a single
edge with endpoint in $v$ to the set of tree edges. Two examples of
spanning trees for the BDD in Figure 1(a) are shown in Figures 1(b)
and 1(c).

In our encoder we will construct a spanning tree containing as
few long edges as possible. Hence when a node $v$ in the BDD
has multiple parents $u_1, \ldots, u_k$ and we have to choose one of the
edges $(u_1, v), \ldots, (u_k, v)$ to add to the spanning tree, we will
always choose the shortest possible edge, that is an edge $(u_i, v)$ where
$l(u_i) \geq l(u_1), \ldots, l(u_k)$. Ties are broken by choosing the edge
$(u, v)$ with the smallest $id_l(u)$. Note that the resulting spanning tree
is uniquely defined regardless of the order in which we process nodes.
Using this construction we obtain a spanning tree with a minimal
number of long edges. This can easily be seen by noting that
precisely one ingoing edge must be chosen for each node, and
additionally that the choice of one edge can never exclude an edge to
another node from consideration.
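
The selection rule above can be sketched in Python as follows. The input representation (an edge list plus `layer` and `layer_id` dictionaries) and the function name are assumptions made for this illustration only.

```python
def build_spanning_tree(edges, layer, layer_id):
    """Pick one incoming edge per non-root node.

    edges    : iterable of (u, v) pairs, one per BDD edge
    layer    : dict node -> layer number l(u)
    layer_id : dict node -> layer id id_l(u)
    Returns a dict mapping each non-root node v to its tree parent.
    """
    parent = {}
    for u, v in edges:
        best = parent.get(v)
        # A shorter edge means a parent in a deeper layer; ties are
        # broken in favour of the parent with the smaller layer id.
        if best is None or (-layer[u], layer_id[u]) < (-layer[best], layer_id[best]):
            parent[v] = u
    return parent
```

Because each node's choice depends only on its own incoming edges, the result does not depend on the order in which the edges are listed, which matches the uniqueness remark in the text.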

Example 9. The spanning tree in Figure 1(b) contains three long
edges, whereas the spanning tree in Figure 1(c) only contains one.
The latter of these would be the one constructed by our encoder upon
compressing the BDD in Figure 1(a). The single long edge in Figure
1(c) has to be included in the tree as it is the only possible way for
the spanning tree to include the node in layer 1.

3.2 Encoding the lengths of the tree edges 

The spanning tree is stored as a binary tree where all edges have 
the same length. Since some of the edges in the spanning tree may 
correspond to long edges in the BDD, the binary tree itself may not 
be sufficient to reconstruct the layer information of the nodes during 
decoding. In order to enable the decoder to deduce the correct layer 
we therefore encode the location and the length of each long edge 
that is included in the spanning tree. The location of a long edge
$(u, v)$ is uniquely specified by the BFS order of the end point of the
edge, that is $id_b(v)$.

When encoding the location of the long edges
$(u_1, v_1), \ldots, (u_k, v_k)$ we will, instead of outputting the
integers $id_b(v_1), \ldots, id_b(v_k)$, output a bitvector of length $|V|$
for which entries $id_b(v_1), \ldots, id_b(v_k)$ are true and all other
entries are false. Though this encoding is likely to require more bits
than encoding by listing the integers, the bitvector will be compressed
very efficiently when the standard compression is applied in the final
phase.
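
As a small illustration of this bitvector, here is a Python sketch. The function names are hypothetical, BFS ids are assumed to be 1-based as in Definition 5, and the edge lengths would be written as a separate list.

```python
def long_edge_bitvector(num_nodes, end_ids):
    """Mark the BFS ids (1-based) of the end points of long tree edges."""
    bits = [0] * num_nodes
    for bfs_id in end_ids:
        bits[bfs_id - 1] = 1
    return bits

def decode_long_edge_ends(bits):
    """Recover the BFS ids of the long-edge end points from the bitvector."""
    return [index + 1 for index, bit in enumerate(bits) if bit]
```

When long tree edges are rare the vector consists almost entirely of zeros, which is the kind of input a general-purpose compressor handles well.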




Figure 2. A spanning tree on a BDD. The black edges are tree edges and 
the gray edges are nontree edges. The nodes are labeled in layer order 



3.3 Encoding nontree edges 

When the spanning tree and the layer information are encoded, we
only need to encode the nontree edges, that is, those edges in the
BDD that are not contained in the spanning tree.

We know that roughly half of the edges in the BDD will be encoded as
nontree edges, as follows from the following observation:

Observation 10. Let $G(V, E)$ be a BDD containing at least 3 nodes.
Then any spanning tree on $G$ will contain exactly $|E|/2 + 1$ edges.

Proof. By the assumption that $|V| > 2$, it holds for any BDD
$G(V, E)$ that $|E| = 2(|V| - 2)$, since all nodes in a BDD except
the two terminals have two children. Further any tree with $|V|$ nodes
contains $|V| - 1$ edges, which equals $|E|/2 + 1$ edges. □

As an example the construction in Figure 2 contains 19 tree edges
and 17 nontree edges.
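
As a check of Observation 10 against these numbers, assume $|V| = 20$ for the BDD in Figure 2 (this is implied by the 19 tree edges, since a tree on $|V|$ nodes has $|V| - 1$ edges):

```latex
\begin{align*}
|E| &= 2(|V| - 2) = 2(20 - 2) = 36, \\
|E_T| &= |V| - 1 = 19 = \frac{|E|}{2} + 1, \\
|E| - |E_T| &= 36 - 19 = 17.
\end{align*}
```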

Since every node except the terminals in the BDD has two chil- 
dren and we have the spanning tree available with restored layer 
information, we know the start point of each of the nontree edges 
that has to be added to the spanning tree in order to reconstruct the 
BDD. Hence if we encode the nontree edges in some fixed order 
according to their start point then we only need to encode the end 
point of each of the nontree edges. We will call the endpoint of any
nontree edge that has yet to be encoded an incomplete child. By $S$
we will denote the sequence of incomplete children in the order in
which their parents appear in the layer ordering of the nodes, that is
$S = (v_1, \ldots, v_k)$ for the nontree edges
$(u_1, v_1), \ldots, (u_k, v_k)$. By $id_l(S)$ we denote the
corresponding sequence of layer ids, that is
$id_l(S) = (id_l(v_1), \ldots, id_l(v_k))$.

We will now describe three encodings of nontree edges in
Sections 3.3.1, 3.3.2 and 3.3.3, which we will combine to
encode all the nontree edges.

3.3.1 Incomplete children with large in-degree 

In order for a standard compression technique to successfully
compress the sequence $id_l(S)$ it is important that the symbols in the
sequence appear with high frequency. We note that a node with
in-degree $d$ will appear $d - 1$ times in the sequence of nontree edges.
Hence by applying standard compression we will be able to
efficiently compress those incomplete children that have a high in-degree
if they are separated from the nodes that have a low in-degree.
Therefore we split the sequence of nodes appearing as incomplete
children in $S$ into two disjoint subsequences $H$ and $L$, the first
containing those incomplete children that have an in-degree larger
than a specified threshold, in their order of appearance in $S$, the
latter containing the rest. For example if the threshold is 3 then
$H = (19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20)$ for the BDD in
Figure 2. Based on $H$ we construct the sequence of integers $S_H$ over
the sequence of nodes $v_1, \ldots, v_k$ in $S$ by encoding $v_i \in H$ as
$id_l(v_i)$ and $v_i \in L$ as 0. The encoded 0s indicate the
incomplete children that are not among the incomplete children with
high in-degree. Note that all integers appearing in $S_H$ occur with
a high frequency and therefore will compress efficiently using
standard compression techniques. The remaining incomplete children $L$
we code separately, as described in the next two sections.
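
The split into $S_H$ and $L$ can be sketched in Python as below. The function names and the `in_degree` dictionary are assumptions for this illustration; the sketch relies on layer ids starting at 1, so that 0 is free to act as the placeholder.

```python
def split_by_in_degree(s_ids, in_degree, threshold):
    """Split id_l(S) into S_H (with 0 placeholders) and the sequence L.

    s_ids     : layer ids of the incomplete children, in the order of S
    in_degree : dict layer id -> in-degree of that node in the BDD
    """
    s_h, l_seq = [], []
    for v in s_ids:
        if in_degree[v] > threshold:
            s_h.append(v)
        else:
            s_h.append(0)      # placeholder: this child is stored in L
            l_seq.append(v)
    return s_h, l_seq

def merge(s_h, l_seq):
    """Decoder side: fill each 0 in S_H with the next element of L."""
    rest = iter(l_seq)
    return [v if v != 0 else next(rest) for v in s_h]
```

The decoder needs no extra position information for $L$, since the zeros in $S_H$ already record where its elements belong.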

3.3.2 Incomplete children with small in-degree 

Using the above encoding, we are left with a sequence of incomplete
children $L$ with very few repetitions. When encoding this sequence
we will exploit the fact that the sequence of integers in $id_l(L)$ will in
most instances tend to be increasing. Below we argue why this is the
case.

1. The first reason is that any node $u$ with an outgoing edge of length
$k$ is naturally restricted to the children occurring in layer $l(u) + k$,
and therefore to the range of $id_l$ indices of the nodes at layer
$l(u) + k$. Since most edges in a BDD are usually rather short
(except those to the terminals), this leads to a natural increasing
progression in $id_l(L)$.

2. Secondly, when examining a set of layers in a BDD it is very
common to see disjoint substructures. For example, in Figure 2
we have two disjoint substructures induced by the nodes
2, 4, 5, 8, 9, 10 and 3, 6, 7, 11, 12, 13. Given two disjoint
substructures, for any given layer $i$ let $I_i^1$ and $I_i^2$ be the
sequences of layer indices of the nodes in that layer from each of them
respectively. Assume for convenience that $I_i^1$ contains the smallest
index. Then it is the case that all indices in $I_i^1$ are strictly smaller
than all indices in $I_i^2$, and furthermore the same applies to $I_j^1$
and $I_j^2$ for any other layer $j$. Since the incomplete children of a
layer are encoded according to the order of their parents, this means
that $I_i^1$ will always appear in $id_l(L)$ before $I_i^2$ for any layer
$i$, helping to ensure an increasing tendency in $id_l(L)$.

3. Thirdly, some possible nontree edges cannot exist, since had they
in fact existed they would have been included in the spanning tree.
This constraint on the nontree edges is stated in the following
observation:

Observation 11. For every nontree edge $(u, v)$ it holds that

$v \notin \{v' \mid \exists (u', v') \in E_T : l(u) = l(u') \land id_l(u) < id_l(u')\}$

In other words: assume that there is a nontree edge $(u, v)$ for
which a right sibling $u'$ of $u$ has the child $v'$ in the spanning tree,
then $v \neq v'$.

Proof. If $v = v'$ then, since $u$ and $u'$ are on the same layer and
$id_l(u) < id_l(u')$, the spanning tree would contain $(u, v)$
rather than $(u', v)$, which contradicts that $(u, v)$ is a nontree edge.
This is due to the fact that the spanning tree is constructed
by a traversal of the nodes in the BDD in layer order. □

Example 12. In Figure 2 an incomplete child of the node 5 cannot
be 11, 12 or 13, since in that case the corresponding edge
would be a tree edge, which contradicts that the child is incomplete.
On the other hand an incomplete child of 7 can be any of the nodes
8, . . . , 12 since 7 is positioned to the right of all its siblings.

What follows from Observation 11 is, roughly stated, that
incomplete children with parents in the "left part" of a layer are bound
to have one of the smaller layer ids in the layer, whereas the
incomplete children with parents in the "right part" of a layer
can have any layer id occurring in the layer.

Given the three reasons mentioned above for why we
expect the sequence of incomplete children to tend towards being
increasing, and as we have observed the increasing trend of $id_l(L)$
in the instances we have tested on, we choose to exploit this fact by
encoding the sequence $id_l(L)$ by delta coding:

Definition 13 (Delta Coding). Consider any sequence of integers
$(i_1, \ldots, i_k) \in \mathbb{Z}^k$ for any $k \in \mathbb{N}$. We define
the delta coding of $(i_1, \ldots, i_k)$ by
$\Delta(i_1, \ldots, i_k) = (i_1, i_2 - i_1, i_3 - i_2, \ldots, i_k - i_{k-1})$.

For instance if $id_l(L) = (9, 12, 14, 15, 16, 17, 18, 20)$, then
$\Delta(id_l(L)) = (9, 3, 2, 1, 1, 1, 1, 2)$. Standard compression will be
able to compress the latter sequence much better than the former
sequence.
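
Definition 13 and its inverse can be written in a few lines of Python. This is a direct transcription of the definition; the function names are our own.

```python
from itertools import accumulate

def delta_encode(seq):
    """Delta coding: first element, then successive differences."""
    seq = list(seq)
    return [cur - prev for prev, cur in zip([0] + seq[:-1], seq)]

def delta_decode(deltas):
    """Inverse of delta_encode: running sums of the deltas."""
    return list(accumulate(deltas))
```

Applied to the sequence from the text, `delta_encode([9, 12, 14, 15, 16, 17, 18, 20])` yields `[9, 3, 2, 1, 1, 1, 1, 2]`, and `delta_decode` restores the original sequence.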

3.3.3 Long forward edges 

Using the encoding of Section 3.3.2, nontree long edges will often
be expensive to encode since they have an incomplete child with an
id that is a lot larger than for the short edges. Hence they will often
result in large deltas in the delta coding. The following approach is an
alternative way of encoding some of the long edges. This technique
is therefore applied prior to the technique in Section 3.3.2, and only
the remaining edges will be coded using delta coding.

A nontree edge $(u, v)$ is a forward edge if $u$ is an ancestor of $v$
in the spanning tree. Consider any forward edge $(u, v)$ in the graph
with length $k$. This edge can be unambiguously encoded by $id_l(v)$
and the length of the edge, as $u$ is the ancestor of $v$ at layer
$l(v) - k$.

In order to know which nodes are endpoints of forward edges
we label each node $v$ by the number of forward edges ending in $v$.
This will be 0 for most nodes and very seldom be more than 1,
ensuring a good compression of the labelling. After this is done we
encode the length of the forward edges. If there are very few long
edges it might not be worth the effort to write the labelling.
Hence we set a threshold on the number of forward edges that
is needed in order to make the encoding of these edges useful. If the
threshold is not exceeded all long forward edges are instead encoded
as described in Section 3.3.2.
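
On the decoder side, recovering the start point of a forward edge from its end point and length amounts to walking up the spanning tree. A minimal Python sketch, assuming the decoded tree is available as a hypothetical `parent` dictionary together with the restored `layer` dictionary:

```python
def forward_edge_source(v, length, parent, layer):
    """Return the ancestor of v in the spanning tree at layer l(v) - length.

    parent : dict node -> tree parent (the root has no entry)
    layer  : dict node -> layer number
    """
    target = layer[v] - length
    u = parent.get(v)
    # Climb towards the root until we reach (or pass) the target layer.
    while u is not None and layer[u] > target:
        u = parent.get(u)
    if u is None or layer[u] != target:
        raise ValueError("no tree ancestor of v at the requested layer")
    return u
```

The explicit check at the end guards against tree paths that skip the target layer through a long tree edge, in which case the pair (end point, length) would not describe a forward edge.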

3.4 An example 

As a final part in the description of our compression technique, we
show how our technique would compress the BDD in Figure 2.

Example 14. Consider the encoding of the BDD in Figure 2.
We first use Lemma 7 to encode the spanning tree
as the bitstring (comma separated only to ease readability)
11, 1111, 11011101, 101010000000, 010100, 1100, 0000. As there
are no long edges in the spanning tree we do not need to encode
layer information; we output 0 to denote that the number
of layer information entries to be added is 0. If we suppose a threshold
of 5 in the encoding of nontree edges with high in-degree, only the
node 19 will be encoded as an incomplete child with high in-degree.
This will be encoded as 19 19 19 19 19 19 19.

We are now left with two long forward edges of length 2, namely
(14, 20) and (16, 20). To encode them we first specify which of the
remaining nontree edges are long forward edges by the bitvector
0000010010, followed by the lengths 2, 2. Finally we encode the
remaining nontree edges by 9 12 14 15 16 17 18 20, which in delta coding
will be 9 3 2 1 1 1 1 2.

4 Experiments 

In this section we provide empirical results from compressing a large
set of BDDs from various sources using the new encoder described
in this paper as well as the encoders from [8] and [5]. For further
comparison we also provide the results from a naive encoder.
The naive encoder outputs the size of each layer followed by a list of 
children. This representation is very similar to the in-memory repre- 
sentation of a BDD except that the layer information is not stored for 
each node but rather implicitly using the layer sizes. 

4.1 Instances 

Many of the instances we show results for are taken from the
configuration library CLib [2]. As a BDD only allows binary variables,
additional steps must be taken in order to encode solutions to
problems containing variables with domains of size larger than 2. For each
non-binary variable in a problem it is customary to either use a number
of binary variables logarithmic in the size of the domain of the
variable and adjust the constraints accordingly, or use one variable for
each domain value. These methods are known as log-encoding [14]
and direct-encoding respectively. Among the instances we have tested,
all those named with the suffix "dir" were compiled using direct
encoding, while the remaining were built using log-encoding. The
instances fall into the following groups:

Product Configuration The instances in this group are all BDDs 
compiled for use with standard interactive product configurators. For 
example the "renault" instance is a car configuration instance, and 
the others are various PC configuration instances. 






Power Supply Restoration These instances were compiled for 
use in configuring the restoration of a power supply grid after a fail- 
ure. As such they are also a type of configuration instances. 

Fault Trees These are instances built for use in reliability analysis 
using fault trees. 

Combinatorial The combinatorial group contains various "toy" 
chess problems of a combinatorial nature. For example the classic 
problem of placing 8 queens on a chessboard without any piece 
being threatened is represented by the instance "8x8queen". The 
"5x27queens" instance models placing 5 queens on a 5x27 chess 
board. 

Multipliers This group contains two BDDs both of which
represent the value of the middle bit in the output obtained by multiplying
two groups of 10 input bits [9]. These are built mixing the input bits
("mult-mix-10") and separating the input bits ("mult-apart-10").

4.2 Post compression 

All the tested encoders create an encoding that is meant to be
subsequently compressed using standard entropy coding methods. In [8]
arithmetic coding is used, while the choice of entropy coding is not
discussed in [5]. To avoid the empirical results being affected by the
choice of standard coding, we instead apply LZMA [10] to the output
of all encoders to produce the final encoding. Due to implementation
details of this final compression step, it is sometimes beneficial to
produce the output that has to be compressed on a byte level instead
of a bit level. To ensure a fair comparison the results stated for [8],
[5] and the naive approach are obtained by trying to output both on
bit level and on byte level and stating the best compression among
the two results. Our own encoder was only tested outputting on a
byte level.
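
The difference between bit-level and byte-level output can be illustrated with a short Python sketch. It uses Python's standard `lzma` module as a stand-in for the LZMA implementation used in the experiments, which is an assumption; the helper names are hypothetical.

```python
import lzma

def pack_bits(bits):
    """Bit level: pack a list of 0/1 values into bytes, 8 per byte."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        byte <<= 8 - len(chunk)   # pad the last byte with zero bits
        out.append(byte)
    return bytes(out)

def compressed_sizes(bits):
    """Return (bit-level size, byte-level size) after LZMA compression."""
    bit_level = lzma.compress(pack_bits(bits))
    byte_level = lzma.compress(bytes(bits))  # one symbol per byte
    return len(bit_level), len(byte_level)
```

Which of the two variants compresses better depends on the data, which is why both are tried for the competing encoders and the better result is reported.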

4.3 Conclusions 

From the empirical results shown in Figure 3 we can immediately
see that it is worthwhile to make use of a dedicated BDD encoder,
as the naive encoding, being only compressed by LZMA, is
outperformed by a factor of up to 20 on some instances. Furthermore
we can see that the encoder introduced in this paper consistently
performs as well as or better than the other encoders on all tested
instances. In particular the largest BDD in our test ("complex-P3")
required about twice as much space when using either of the two
other dedicated encoders.

Instance properties For most of the instances included here it is
the case that a very large fraction (30% to nearly 50%) of the edges
lead to the zero-terminal. The exceptions to this are the multiplier
instances, "5x27queens" and the "rook" instances. Slightly less
expected is the fact that it is quite rare for nodes other than the
zero-terminal to have a significant in-degree, this only occurring with any
great significance in "5x27queens" and to a lesser extent in
"complex-P2" and the multipliers. This means that in quite a few cases
nearly all of the nontree edges are simply edges to the zero-terminal,
essentially turning $S_H$ into a bitvector marking almost all the edges
as zero-terminal edges, allowing for very efficient compression. This
can be seen in the results, where the "5x27queens", the rook and the
multiplier instances all turn out to compress less efficiently. An
additional important trend is that nodes which cannot be reached by
following a short edge from a parent are very rare, meaning that our
encoder in the vast majority of cases only needs to provide layer
information for less than 1% of the nodes, which is a significant
advantage over previous encoders.

Availability The Java source code used for these experiments 
(including a command-line encoder and decoder for BDDs in the 
BuDDy [6] file format) is available along with all instances used in 
these experiments at (URL removed for blind review). 

Figure 3. The above table shows the name and size (in nodes) of each of
the instances tested. The result of the new encoder, in bits per node, is then
shown, followed by the relative results of the rest of the encoders. The *
indicates results obtained from our encoder when delta coding is not used in
the encoding of nontree edges.